2 Compact Suffix Arrays

Author

  • Nick Harvey
Abstract

The suffix array data structure that we present is due to Grossi and Vitter [1]. It uses a recursive construction that inflates the alphabet size, much like the suffix array construction that we saw in Lecture 18. Building on this, we will construct a low-space suffix tree by augmenting this suffix array with an additional tree structure. This construction is due to Munro, Raman and Rao [2], and it relies on techniques that we saw in the previous lecture.
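As a rough illustration of the two ideas the abstract names, the sketch below builds a suffix array naively and shows one step of alphabet inflation. It is not the Grossi and Vitter structure itself. The function names `suffix_array` and `pair_symbols`, the padding symbol `$`, and the choice to inflate the alphabet by pairing adjacent symbols are all assumptions made for illustration; the abstract does not say how the alphabet is inflated.

```python
def suffix_array(text):
    """Naive suffix array: start positions of all suffixes, in sorted order.

    This takes O(n^2 log n) time in the worst case; it only shows what the
    structure contains, not how a compact version is built.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def pair_symbols(text, pad="$"):
    """One assumed level of alphabet inflation.

    Adjacent symbols are merged into pairs, so the text length halves
    while the alphabet size is (at most) squared. The pad symbol is a
    hypothetical choice used to even out odd-length inputs.
    """
    if len(text) % 2:
        text += pad
    return [text[i:i + 2] for i in range(0, len(text), 2)]


if __name__ == "__main__":
    print(suffix_array("banana"))   # positions of sorted suffixes
    print(pair_symbols("banana"))   # text over the inflated alphabet
```

A recursive construction would apply the same idea again to the paired text, which is half as long at each level.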


Similar resources

Compact Suffix Trees Resemble PATRICIA Tries: Limiting Distribution of the Depth

Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we prove that the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even...


The Virtual Suffix Tree: An Efficient Data Structure for Suffix Trees and Suffix Arrays

We introduce the VST (virtual suffix tree), an efficient data structure for suffix trees and suffix arrays. Starting from the suffix array, we construct the suffix tree, from which we derive the virtual suffix tree. The VST provides the same functionality as the suffix tree, including suffix links, but at a much smaller space requirement. It has the same linear time construction even for large ...


Suffix Arrays on Words

Surprisingly enough, it is not yet known how to build directly a suffix array that indexes just the k positions at word-boundaries of a text T [1, n], taking O(n) time and O(k) space in addition to T . We propose a class-note solution to this problem that achieves such optimal time and space bounds. Word-based versions of indexes achieving the same time/space bounds were already known for suffi...


A Compact RDF Store Using Suffix Arrays

RDF has become a standard format to describe resources in the Semantic Web and other scenarios. RDF data is composed of triples (subject, predicate, object), referring respectively to a resource, a property of that resource, and the value of such property. Compact storage schemes allow fitting larger datasets in main memory for faster processing. On the other hand, supporting efficient SPARQL q...


On the combinatorics of suffix arrays

We prove several combinatorial properties of suffix arrays, including a characterization of suffix arrays through a bijection with a certain well-defined class of permutations. Our approach is based on the characterization of Burrows-Wheeler arrays given in [1], that we apply by reducing suffix sorting to cyclic shift sorting through the use of an additional sentinel symbol. We show that the ch...



Publication date: 2005